Abstract: Extreme Learning Machine (ELM) is an emerging 
learning paradigm for nonlinear regression problems and has 
shown its effectiveness in the machine learning community. 
An important feature of ELM is that the learning speed is 
extremely fast thanks to its random projection preprocessing 
step. This feature is taken advantage of in designing an online 
parameter estimation algorithm for nonlinear dynamic systems 
in this paper. The ELM-type random projection and nonlinear
transformation in the hidden layer, together with a linear
output layer, are considered as a generalized model structure
for a given nonlinear system, and a parameter update law is
constructed based on Lyapunov principles. Simulation results
on a DC motor and a Lorenz oscillator show that the proposed
algorithm is stable and has improved performance over the
online-learning ELM algorithm.

I. INTRODUCTION 

System identification is the process of obtaining mathematical
models of systems using input-output data. System
identification is important in design and analysis of control 
systems when the development of a physics-based dynamical 
model is not trivial. Several algorithms for identification of a 
linear system exist [1], [2] but when the nonlinearity is of a 
higher order, the local linear assumption fails and it becomes 
important to develop nonlinear identification methods. There 
exist online identification algorithms for nonlinear systems 
as well. Since the underlying structure is not assumed for 
the nonlinear system, a neural network type model can be 
a good choice [3], [4] among others. Such algorithms rely 
on linearizing the basis functions to obtain the gradient of 
the output error with respect to the network parameters. 
Different from the previous approaches, this paper makes
use of the recently developed Extreme Learning Machines
(ELM) for mapping the system nonlinearity. ELM's random
projection preprocessing stage projects the input data onto
a high dimensional space in which the features can be mapped
using a linear least squares method. By exploiting this stage,
the proposed algorithm inherits the high learning speed of
ELM. Using a Lyapunov method, a stable parameter update law
for nonlinear system identification is developed for
continuous time dynamic systems.

II. EXTREME LEARNING MACHINES - A REVIEW 

Fig. 1. ELM Model Structure.

Extreme Learning Machine (ELM) is an emerging learning
paradigm for multi-class classification and regression
problems [5], [6]. The highlight of ELM compared to other
state-of-the-art methodologies such as neural networks and
support vector machines is that its training speed is
extremely fast. The key enabler for ELM's training speed is
the random assignment of input layer parameters, which do
not require adaptation to the data. In such a setup, the output
layer parameters can be determined analytically using least
squares. Some of the attractive features of ELM [5] are listed
below

1) ELM is a universal approximator

2) ELM results in the smallest training error without
getting trapped in local minima (better accuracy)

3) ELM does not require iterative training (low computational
demand)

4) ELM solution has the smallest norm of weights (better
generalization)

5) The minimum norm least square solution by ELM is
unique.

ELM was developed from a machine learning perspective,
in which data observations are considered independent and
identically distributed. The observations are therefore
discrete, and ELM may not be directly suitable for a dynamic
system application, where the data is connected in time.
However, ELM can be applied to system identification in
discrete time by using a series-parallel formulation [3]. A
generic nonlinear identification problem using the nonlinear
autoregressive model with exogenous input (NARX) is
considered as follows

$$y(k) = f\left[u(k-1), \ldots, u(k-n_u),\; y(k-1), \ldots, y(k-n_y)\right] \qquad (1)$$

where $u(k) \in \mathbb{R}^{u_d}$ and $y(k) \in \mathbb{R}^{y_d}$ represent the inputs and
outputs of the system respectively, $k$ represents the discrete
time index, $f(\cdot)$ represents the nonlinear function mapping
specified by the model, $n_u$, $n_y$ represent the number of past
input and output samples required (order of the system) for
prediction, while $u_d$ and $y_d$ represent the dimension of inputs
and outputs respectively.

A. Offline learning algorithm 

The input-output measurement sequence of system (1) can
be converted to the form of training data as required by ELM

$$\{(x_1, y_1), \ldots, (x_N, y_N)\} \in (\mathcal{X}, \mathcal{Y}) \qquad (2)$$

where $\mathcal{X}$ denotes the space of the input features (here
$\mathcal{X} = \mathbb{R}^{u_d n_u + y_d n_y}$ and $\mathcal{Y} = \mathbb{R}^{y_d}$) and $x$ represents the
augmented input vector obtained by appending the input and
output measurements from the system as follows

$$x = [u(k-1), \ldots, u(k-n_u),\; y(k-1), \ldots, y(k-n_y)]^T \qquad (3)$$
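The construction of the augmented input vector in (3) can be illustrated with a short sketch. It assumes scalar input and output signals ($u_d = y_d = 1$) stored as plain lists; the function name `build_narx_regressors` and the sample data are hypothetical and only serve the illustration.

```python
def build_narx_regressors(u, y, n_u, n_y):
    """Return (X, Y) with X[i] = [u(k-1..k-n_u), y(k-1..k-n_y)] and Y[i] = y(k)."""
    start = max(n_u, n_y)  # first time index k that has a full history
    X, Y = [], []
    for k in range(start, len(y)):
        past_u = [u[k - i] for i in range(1, n_u + 1)]
        past_y = [y[k - j] for j in range(1, n_y + 1)]
        X.append(past_u + past_y)  # augmented input vector, as in (3)
        Y.append(y[k])             # target is the current output y(k)
    return X, Y


# Scalar example with n_u = 2 past inputs and n_y = 1 past output
u = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [10.0, 11.0, 12.0, 13.0, 14.0]
X, Y = build_narx_regressors(u, y, n_u=2, n_y=1)
```

Each row of `X` paired with the matching entry of `Y` is one training sample in the sense of (2).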

The ELM is a unified representation of single layer
feed-forward networks (SLFN) and is given by (4), where
$g$ represents the hidden layer activation function and $W_r$, $W$
represent the input and output layer parameters respectively.

$$y = [g(W_r^T x + b_r)]^T W \qquad (4)$$



The matrix $W_r$ consists of randomly assigned elements that
map the input vector to a high dimensional feature space,
while $b_r$ is a bias component assigned in a random manner
similar to $W_r$. The elements can be assigned based on
any continuous random distribution [6] and remain fixed
during training. The number of hidden neurons determines the
dimension of the transformed feature space, and the hidden
layer is equipped with a nonlinear activation function similar
to traditional neural network architectures. It should be noted
that in nonlinear regression using neural networks, for
instance, the input layer parameters $W_r$ and the output layer
parameters $W$ are adjusted simultaneously during training.
Since there is a nonlinear connection between the two layers,
iterative techniques are the only possible solution. ELM,
however, avoids iterative training as the input layer
parameters are randomly selected [5]. Hence the training step
of ELM reduces to finding a least squares solution for the
output layer parameters $W$ given by
where $\lambda$ represents the regularization coefficient, $T$ represents
the vector of outputs or targets and $H$ the hidden layer output
matrix as termed in the literature (see Figure 1).
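The offline training step can be sketched in a few lines. The sketch assumes the standard regularized least-squares form $W = (H^T H + \lambda I)^{-1} H^T T$, which is consistent with (7) but is not copied from the text. The sigmoid activation, the uniform distribution on [-1, 1], the test function, and the names `elm_hidden`, `elm_fit` and `elm_predict` are likewise illustrative choices.

```python
import numpy as np


def elm_hidden(X, W_r, b_r):
    """Hidden layer output H = g(X W_r + b_r) with a sigmoid g; one row per sample."""
    return 1.0 / (1.0 + np.exp(-(X @ W_r + b_r)))


def elm_fit(X, T, n_hidden, lam=1e-6, seed=0):
    """Draw random input weights once, then solve for the output weights only."""
    rng = np.random.default_rng(seed)
    W_r = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random, never trained
    b_r = rng.uniform(-1.0, 1.0, size=n_hidden)                # random, never trained
    H = elm_hidden(X, W_r, b_r)
    # Assumed ridge form: W = (H^T H + lam I)^{-1} H^T T
    W = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
    return W_r, b_r, W


def elm_predict(X, W_r, b_r, W):
    return elm_hidden(X, W_r, b_r) @ W


# Fit a smooth scalar function on data already scaled to [-1, 1]
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
T = np.sin(3.0 * X)
W_r, b_r, W = elm_fit(X, T, n_hidden=30)
rmse = float(np.sqrt(np.mean((elm_predict(X, W_r, b_r, W) - T) ** 2)))
```

Samples are stored as rows here, so `X @ W_r` plays the role of $W_r^T x$ in (4); no iterative training is involved, only one linear solve.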

B. Online learning algorithm 

In the batch training mode (offline training), all the data 
is assumed to be present. However, for an online system 
identification problem, data is sampled continuously and is 
available one by one. Hence the sequential learning algorithm 
can be modified to perform identification. The ELM online 
sequential algorithm can be formulated as follows [7] 



As an initialization step, a set of data observations is
required to initialize $H_0$ and $W_0$ by solving

$$\min_{W_0} \left\{ \|H_0 W_0 - Y_0\|^2 + \lambda \|W_0\|^2 \right\} \qquad (7)$$

$$H_0 = [g(W_r^T x_0 + b_r)]^T \in \mathbb{R}^{n_0 \times n_h} \qquad (8)$$



where $n_0$ and $n_h$ represent the number of data observations
in the initialization step and the number of hidden neurons
of the ELM model respectively. The solution $W_0$ is given by

$$W_0 = K_0^{-1} H_0^T Y_0 \qquad (9)$$

where $K_0 = H_0^T H_0$. Given a new data observation $x_1$,
the problem becomes



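$$\min_{W_1} \left\| \begin{bmatrix} H_0 \\ H_1 \end{bmatrix} W_1 - \begin{bmatrix} Y_0 \\ Y_1 \end{bmatrix} \right\|^2 \qquad (10)$$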



The solution can be derived as

Based on the above, a generalized recursive algorithm for
updating the least-squares solution can be computed as
follows

$$P_{k+1} = P_k - P_k H_{k+1}^T \left(I + H_{k+1} P_k H_{k+1}^T\right)^{-1} H_{k+1} P_k \qquad (11)$$

$$W_{k+1} = W_k + P_{k+1} H_{k+1}^T \left(T_{k+1} - H_{k+1} W_k\right) \qquad (12)$$
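The recursion in (11) and (12) can be checked numerically against a batch solution. The following sketch uses random matrices as stand-ins for the hidden layer outputs, and adds a very small ridge term to $K_0$ purely to keep the initial inverse well defined; that term, the synthetic data, and the names `os_elm_init` and `os_elm_update` are assumptions made for the example.

```python
import numpy as np


def os_elm_init(H0, Y0, lam=1e-8):
    """Initial solution P0 = K0^{-1}, W0 = P0 H0^T Y0 (small ridge term is an assumption)."""
    P = np.linalg.inv(H0.T @ H0 + lam * np.eye(H0.shape[1]))
    W = P @ H0.T @ Y0
    return P, W


def os_elm_update(P, W, H_new, T_new):
    """One recursive step following (11) and (12); H_new has one row per new sample."""
    S = np.eye(H_new.shape[0]) + H_new @ P @ H_new.T
    P = P - P @ H_new.T @ np.linalg.solve(S, H_new @ P)   # equation (11)
    W = W + P @ H_new.T @ (T_new - H_new @ W)             # equation (12)
    return P, W


rng = np.random.default_rng(1)
H = rng.normal(size=(60, 5))                              # stand-in hidden layer outputs
T = H @ rng.normal(size=(5, 2)) + 0.01 * rng.normal(size=(60, 2))

P, W = os_elm_init(H[:20], T[:20])
for k in range(20, 60):                                   # feed remaining samples one by one
    P, W = os_elm_update(P, W, H[k:k + 1], T[k:k + 1])

# Batch solution over all 60 samples with the same ridge term, for comparison
W_batch = np.linalg.solve(H.T @ H + 1e-8 * np.eye(5), H.T @ T)
```

After the last update, `W` agrees with `W_batch` up to rounding, which is the property that makes the sequential form usable when data arrives one sample at a time.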



III. LYAPUNOV BASED PARAMETER UPDATE LAW

The parameter update law is derived for a continuous 
time system. A general multi-input multi-output (MIMO) 
nonlinear dynamic system is given by 



$$\dot{z}(t) = f(z(t), u(t)) \qquad (13)$$



where the state vector $z \in \mathbb{R}^{n \times 1}$ and the input (or control) vector
$u \in \mathbb{R}^{m \times 1}$. By adding and subtracting $Az(t)$, where
$A \in \mathbb{R}^{n \times n}$ is a Hurwitz matrix, the system (13) becomes

$$\dot{z}(t) = Az(t) + g(z(t), u(t)) \qquad (14)$$

where $g(z(t),u(t)) = f(z(t),u(t)) - Az(t)$ describes the
system nonlinearity. Assume that ELM can model the system
nonlinearity $g(z(t), u(t))$ with an accuracy of $\epsilon$. If we assume
bounded inputs and bounded states for the system (13), then
$\epsilon(t)$ for the model is finite and is bounded above by $\bar{\epsilon}$ [5].
The system (14) can now be represented by

$$\dot{z}(t) = Az(t) + W_*^T \phi + \epsilon(t) \qquad (15)$$

The parametric model of the system can be considered as 



$$\dot{\hat{z}}(t) = A\hat{z}(t) + W^T \phi \qquad (16)$$



where $W_*$ and $W$ represent the actual and estimated parameters
of the ELM model, and $\phi$ represents the hidden layer output
of ELM (see Figure 1). It should be noted that the input-hidden
layer connection parameters $W_r$ have been chosen randomly and
fixed, since ELM only needs tuning of the output layer weights
$W$. Hence $\phi$ can be considered the same for both the system
and the parametric model, which is a simplification achieved
with the help of the ELM formulation. This simplicity cannot
be achieved using traditional back-propagation neural networks
and is hence the strength of the proposed method. The
estimation error and the error dynamics are given by



$$e(t) = z - \hat{z} \qquad (17)$$

$$\dot{e}(t) = Ae(t) + (W_*^T - W^T)\phi + \epsilon(t) \qquad (18)$$

$$\dot{e}(t) = Ae(t) + \tilde{W}^T \phi + \epsilon(t) \qquad (19)$$



where $\tilde{W} = W_* - W$ represents the parameter error.

In order to have a stable parameter update law that 
guarantees convergence of both estimation error and the 
parametric error to zero, the following Lyapunov function 
is considered. 
(23)

By applying the universal approximation capability of ELM,
the approximation error $\epsilon$ can be made arbitrarily small and
hence $V$ converges to zero. Hence with proper selection
of the number of hidden neurons $n_h$ of ELM and with
persistent excitation, both the estimation error $e$ as well as
the parameter error $\tilde{W}$ can be made to converge to zero. It
should be noted that as long as the estimation error is above
$\tau$, the stability of the algorithm is guaranteed. The value
of $\tau$ can be chosen to be the required accuracy of ELM
approximation [8], [4] so that the adaptation can occur as
long as the model approximation error is greater than the
required accuracy.
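For orientation, the kind of argument referred to above can be sketched with a textbook choice of Lyapunov function for the error dynamics $\dot{e} = Ae + \tilde{W}^T\phi + \epsilon$. This is an assumed standard form, not necessarily the expression used in this paper: the symbols $P$, $Q$ and $\Gamma$ are introduced here for illustration, and $W_*$ is taken to be constant so that $\dot{\tilde{W}} = -\dot{W}$.

```latex
\begin{aligned}
V &= e^T P e + \operatorname{tr}\!\left(\tilde{W}^T \Gamma^{-1} \tilde{W}\right),
  \qquad A^T P + P A = -Q, \quad P, Q, \Gamma \succ 0 \\
\dot{V} &= -e^T Q e + 2 e^T P \tilde{W}^T \phi + 2 e^T P \epsilon
  + 2 \operatorname{tr}\!\left(\tilde{W}^T \Gamma^{-1} \dot{\tilde{W}}\right) \\
\dot{W} &= \Gamma \phi e^T P
  \;\Rightarrow\; \dot{\tilde{W}} = -\Gamma \phi e^T P
  \;\Rightarrow\; \dot{V} = -e^T Q e + 2 e^T P \epsilon \\
\dot{V} &\le -\lambda_{\min}(Q)\,\|e\|^2 + 2\,\|P\|\,\|e\|\,\bar{\epsilon} < 0
  \quad \text{whenever} \quad \|e\| > \frac{2\,\|P\|\,\bar{\epsilon}}{\lambda_{\min}(Q)}
\end{aligned}
```

Under these assumptions the cross term in $\tilde{W}$ cancels exactly, and $V$ decreases whenever the estimation error exceeds a threshold proportional to the approximation bound, which matches the role the text assigns to a minimum error level for adaptation.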



Hence the parameter estimation algorithm based on Lyapunov
analysis is given by

(24)
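A continuous-time update of this kind can be tried out numerically. The following is only a sketch under stated assumptions: it uses the common gradient-type law $\dot{W} = \Gamma \phi e^T P$ with a dead zone of width $\tau$, a hypothetical scalar plant $\dot{z} = -z - z^3 + u$, the choices $A = -a$ and $P = 1$, and forward Euler integration. None of these choices, nor the names `simulate`, `gamma` and `tau`, are taken from the paper.

```python
import numpy as np


def simulate(adapt, t_end=20.0, dt=1e-3, n_h=10, a=10.0, gamma=100.0, tau=1e-3, seed=0):
    """Return the mean |e| over the last 5 s of a scalar identification run."""
    rng = np.random.default_rng(seed)
    W_r = rng.uniform(-1.0, 1.0, size=(2, n_h))   # fixed random input weights
    b_r = rng.uniform(-1.0, 1.0, size=n_h)        # fixed random biases
    W = np.zeros(n_h)                             # output weights to be estimated
    z, z_hat = 0.0, 0.0
    abs_err = []
    for k in range(int(t_end / dt)):
        t = k * dt
        u = 0.5 * np.sin(t) + 0.3 * np.sin(2.7 * t)            # bounded test input
        phi = 1.0 / (1.0 + np.exp(-(W_r.T @ np.array([z, u]) + b_r)))
        e = z - z_hat                                          # estimation error (17)
        abs_err.append(abs(e))
        if adapt and abs(e) > tau:                             # dead zone: adapt only above tau
            W = W + dt * gamma * phi * e                       # assumed law, with P = 1
        z_hat = z_hat + dt * (-a * z_hat + W @ phi)            # parametric model (16), A = -a
        z = z + dt * (-z - z ** 3 + u)                         # hypothetical plant
    tail = abs_err[-int(5.0 / dt):]
    return float(sum(tail) / len(tail))


err_fixed = simulate(adapt=False)   # output weights frozen at zero
err_adapt = simulate(adapt=True)    # output weights adapted online
```

Comparing `err_adapt` with `err_fixed` shows the effect of adaptation on the state estimation error for this toy plant; only the output weights `W` change, while `W_r` and `b_r` stay at their random initial values.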



IV. SIMULATIONS 



The two algorithms compared in the simulation study are
the existing online ELM algorithm [7] and the proposed
Lyapunov based ELM algorithm. For all the simulations,
the same ELM model structure with the same randomly
assigned input layer weights and biases ($W_r$ and $b_r$) as well
as the same initial condition for the output layer weights ($W_0$)
are imposed. The design matrix $A$ can also be appropriately
chosen to suit the requirements on overshoot and settling
time of the parameter estimation [8], [4].

It should be noted that the input layer parameters $W_r$ are
fixed. ELM requires all data to be normalized to lie between
-1 and +1, and hence appropriate scaling is introduced during
simulation. The limits of the states and inputs are known a
priori and can be used in the normalization. The inputs to
the system have to be persistently exciting (as required for
parameter convergence), which is not easy to achieve in
nonlinear systems. Hence the input signal follows a
pseudo-random multi-level sequence (PRMS), which represents
several combinations of step inputs at different magnitudes
and frequencies suitable for exciting nonlinear systems [9].
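A simple way to picture the excitation signal and the scaling step is the sketch below. It generates a piecewise-constant signal whose level and hold time are both drawn at random, then maps it to [-1, 1] using known limits. The level set, hold-time range, and the names `prms` and `normalize` are assumptions for illustration, not the settings used in the simulations.

```python
import random


def prms(n_samples, levels, min_hold, max_hold, seed=0):
    """Piecewise-constant signal: a random level held for a random number of samples."""
    rng = random.Random(seed)
    signal = []
    while len(signal) < n_samples:
        level = rng.choice(levels)
        hold = rng.randint(min_hold, max_hold)   # inclusive bounds
        signal.extend([level] * hold)
    return signal[:n_samples]


def normalize(value, lower, upper):
    """Map a value from the known range [lower, upper] to [-1, 1]."""
    return 2.0 * (value - lower) / (upper - lower) - 1.0


levels = [0.0, 2.5, 5.0, 7.5, 10.0]
u = prms(n_samples=1000, levels=levels, min_hold=20, max_hold=100)
u_scaled = [normalize(v, 0.0, 10.0) for v in u]
```

Short hold times excite fast dynamics and long hold times let the states settle, so varying both magnitude and duration covers more of the operating range than a single step.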

A. DC motor example 

A nonlinear DC motor system is considered whose dynamic
equations are as follows



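$$\dot{x} = f(x) + g(x)u \qquad (25)$$

where

$$f(x) = \begin{bmatrix} -c_1 x_1 + c_3 \\ -c_4 x_2 \end{bmatrix}, \qquad g(x) = \begin{bmatrix} -c_2 x_2 \\ -c_5 x_1 \end{bmatrix}$$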


with $c_1 = 60$, $c_2 = 0.5$, $c_3 = 40$, $c_4 = 6$, $c_5 = 40000$. The design
matrix $A$ is chosen as

$$A = \begin{bmatrix} -50 & 0 \\ 0 & -50 \end{bmatrix}$$


The number of hidden neurons for the ELM model is chosen
as 8. A sigmoidal activation function is considered as the input
layer activation function. Two cases are compared: with and
without Gaussian noise at the measurement. The results are
summarized in Figures 2-4 for the case without noise and in
Figures 5-7 for the case with noise. The root mean squared
error (RMSE) between the states of the actual and estimated
systems is compared in Table I.
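The exact normalization behind the reported RMSE values is not stated. As one plausible reading, the sketch below divides the RMSE of a state trajectory by the range of the actual signal; this definition, the function name `normalized_rmse`, and the sample data are assumptions for illustration only.

```python
import math


def normalized_rmse(actual, estimated):
    """RMSE of (actual - estimated) divided by the range of the actual signal."""
    n = len(actual)
    rmse = math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated)) / n)
    return rmse / (max(actual) - min(actual))


actual = [0.0, 1.0, 2.0, 3.0, 4.0]
estimated = [0.0, 1.0, 2.0, 3.0, 2.0]
score = normalized_rmse(actual, estimated)
```

Dividing by the signal range makes errors comparable across states with very different magnitudes, which matters when the states of one system differ by orders of magnitude.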
Fig. 2. Comparison of system states of actual and estimated system by
Lyapunov ELM and Online ELM for DC motor system.

Fig. 5. Comparison of system states of actual and estimated system
by Lyapunov ELM and Online ELM for DC motor system with Gaussian
measurement noise.

Fig. 3. Convergence of error between the states of actual and estimated
system by Lyapunov ELM and Online ELM for DC motor system.

Fig. 6. Convergence of error between the states of actual and estimated
system by Lyapunov ELM and Online ELM for DC motor system with
Gaussian measurement noise.

Fig. 4. Parametric Convergence (only few parameters shown) by Lyapunov
ELM and Online ELM for DC motor system.

Fig. 7. Parametric Convergence (only few parameters shown) by Lyapunov
ELM and Online ELM for DC motor system with Gaussian measurement
noise.



TABLE I

Comparison of normalized RMSE of the error between the
states of the nonlinear system and the models by Online
ELM and Lyapunov ELM for the DC motor system

                               Online ELM   Lyapunov ELM
normalized RMSE                0.4635       0.0935
normalized RMSE (with noise)   0.4626       0.0936

TABLE II

Comparison of normalized RMSE of the error between the
states of the nonlinear system and the models by Online
ELM and Lyapunov ELM for the Lorenz system

                               Online ELM   Lyapunov ELM
normalized RMSE                0.2085       0.0652
normalized RMSE (with noise)   0.2424       0.1139
Fig. 8. Comparison of system states of actual and estimated system by
Lyapunov ELM and Online ELM for Lorenz system.



B. Lorenz oscillator

A chaotic dynamic system is a nonlinear deterministic
system that displays complex and unpredictable behavior.
Such systems are very sensitive to initial conditions and
system parameters. One way to represent a chaotic system
is the Lorenz system, whose dynamic equations are as follows

$$\dot{x} = \sigma(y - x)$$
$$\dot{y} = rx - y - xz$$
$$\dot{z} = xy - bz$$

where $\sigma, r, b > 0$ are system parameters. For this simulation,
$\sigma = 10$, $r = 28$ and $b = 8/3$ are considered. It should be noted that
there is no excitation input to the system.
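The Lorenz equations with the parameter values above can be integrated with a few lines of code to produce reference trajectories. The sketch uses a classical fourth-order Runge-Kutta step; the step size, initial state, and the names `lorenz_rhs` and `rk4_step` are illustrative assumptions and are not taken from the simulation setup.

```python
def lorenz_rhs(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations for state = (x, y, z)."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b_ + 2 * c + d)
                 for s, a, b_, c, d in zip(state, k1, k2, k3, k4))


state = (1.0, 1.0, 1.0)          # arbitrary initial condition
trajectory = [state]
for _ in range(5000):            # 25 time units with dt = 0.005
    state = rk4_step(state, 0.005)
    trajectory.append(state)
```

Because the system is autonomous, the trajectory itself provides the excitation; the stored states can then be scaled to [-1, 1] before being fed to an identification scheme.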
The design matrix $A$ is chosen as

-120



The number of hidden neurons for the ELM model is chosen
as 12. A sigmoidal activation function is considered as the
input layer activation function. Two cases are compared:
with and without Gaussian noise at the measurement. The
results are summarized in Figures 8-10 for the case without
noise and in Figures 11-13 for the case with noise. The
root mean squared error between the states of the
actual and estimated systems is compared in Table II.

V. DISCUSSION

It can be observed from the simulation results that the
proposed Lyapunov ELM algorithm is suited for nonlinear
system identification and performs better than a
sequential learning online ELM algorithm. From Figures 3
and 9 it can be observed that the states of the system and the
estimated model converge for both examples. From Figures
4 and 10 the convergence of model parameters can be seen, but
it is not guaranteed that the parameters converge to their
true values, as the model structure takes a general form and
is independent of the actual system. The above are observed for
the cases with measurement noise too. It can also be observed
from Figures 4 and 7 that parameter convergence
may be faster for the Lyapunov ELM case compared to the
online ELM algorithm. Also, parameter convergence appears
to be monotonic for the Lyapunov ELM case. Finally, from
Tables I and II it can be observed that the Lyapunov ELM
outperforms the online ELM algorithm and achieves a better
accuracy in terms of the estimated states. It should be noted
that the design matrix $A$ needs tuning depending on the
nature of transient response in prediction. However, this is
straightforward, as a decrease in the magnitude of the
eigenvalues of $A$ results in a faster tracking. This gives additional
flexibility and control over the Lyapunov ELM's performance.

Fig. 9. Convergence of error between the states of actual and estimated
system by Lyapunov ELM and Online ELM for Lorenz system.

Fig. 10. Parametric Convergence (only few parameters shown) by Lyapunov
ELM and Online ELM for Lorenz system.

Fig. 11. Comparison of system states of actual and estimated system
by Lyapunov ELM and Online ELM for Lorenz system with Gaussian
measurement noise.

Fig. 12. Convergence of error between the states of actual and estimated
system by Lyapunov ELM and Online ELM for Lorenz system with
Gaussian measurement noise.

VI. CONCLUSIONS 

An online system identification algorithm for nonlinear 
systems has been developed using a Lyapunov approach. 
The complexity of the proposed algorithm is similar to 
that of a linear parameter estimation thanks to the random 
preprocessing step featured by extreme learning machines. 
The proposed algorithm carries over the simplicity of ELM 
but performs better than the online version of ELM owing 
to the stability guarantee of Lyapunov's method. Simulation 
results on two examples support the validity of the proposed
algorithm. Future work will focus on application to a complex
real world nonlinear dynamic system and on studying its
convergence properties.
Fig. 13. Parametric Convergence (few parameters shown) by Lyapunov
ELM and Online ELM for Lorenz system with Gaussian measurement noise.



